Abstract

We consider the multitasking associative network in the low-storage limit and we study its
phase diagram with respect to the noise level T and the degree d of dilution in pattern entries.
We find that the system is characterized by a rich variety of stable states, among which pure
states, parallel retrieval states, hierarchically organized states and symmetric mixtures
(remarkably, both even and odd), whose complexity increases as the number of patterns P grows.
The analysis is performed both analytically and numerically: exploiting techniques based on
partial differential equations allows us to get the self-consistencies for the order parameters.
Such self-consistency equations are then solved and the solutions are further checked through
stability theory to catalog their organization into the phase diagram, which is completely
outlined at the end. This is a further step toward the understanding of spontaneous parallel
processing in associative networks.
1 Introduction

The paradigm, introduced almost three decades ago by Amit, Gutfreund and Sompolinsky [1] [5],
of analyzing neural networks through techniques stemming from the statistical mechanics of
disordered systems (mainly the celebrated Replica Trick [3] for the Hopfield model [1]) has been
so prolific that its applications went far beyond Artificial Intelligence and Robotics,
overlapping Statistical Inference [5], System Biology [5], Financial Market planning [7],
Theoretical Immunology [5] and much more.

As a result, research in this field is under continuous development, ranging from the diverse
applications outlined above to an ever deeper understanding of the underlying core theory. In
order to reach results closer to experimental neuroscience outcomes, scientists involved in the
field tried to bypass the rather crude mean-field description of a fully connected network of
interacting neurons, embedding them in diluted topologies such as Erdős-Rényi graphs [9],
small worlds [10] or even finitely connected graphs [11]. The main point was showing the
robustness of the mean-field paradigm even in these diluted, and in some sense "closer to
biology", versions, and this was indeed successfully achieved (with the exception of too extreme
degrees of dilution, where the associative capacities of the network trivially break down).

*Università di Parma, Dipartimento di Fisica and INFN Gruppo di Parma, Italy
†Sapienza Università di Roma, Dipartimento di Fisica and GNFM Gruppo di Roma, Italy
‡Sapienza Università di Roma, Dipartimento di Matematica, Italy
§Sapienza Università di Roma, Dipartimento di Matematica, Italy

Recently, a mapping between Hopfield networks and Boltzmann machines [12] [13] allowed the
introduction of dilution into associative networks from a perspective different from standard
link removal à la Sompolinsky [9] or à la Coolen [10] [11]. In fact, while in their papers these
authors perform dilution directly on the Hopfield network, through the equivalence with the
Boltzmann machine one may perform link dilution on the Boltzmann machine and then map the latter
back into the associative Hopfield-like network, checking for its emerging properties [14].
Remarkably, the resulting model still works as an associative performer, as the Hebbian structure
is preserved, but its capabilities are quite different from the standard scenario. In particular,
the resulting associative network may still be fully connected, but the stored patterns of
information display entries which, beyond coding information through digital values ±1, can also
be blank [14] [15]. In fact, any missing link in the bipartite Boltzmann machine corresponds to a
blank entry in the related pattern of the associative network.
Now, while standard dilution (i.e., performed directly on the Hopfield network) does not
qualitatively change the system performance, the behavior of the system resulting from hidden
dilution (i.e., performed on the underlying Boltzmann machine) becomes "multitasking": the
retrieval of a single pattern, say $\xi^1$, does not employ all the neurons, and those coupled
with the blank entries of $\xi^1$ are free to align with $\xi^2$, whose entries are partially
blank as well, hence eliciting, in turn, the retrieval of $\xi^3$ and so on, up to a parallel
load of all the stored patterns that is logarithmic with respect to the volume $N$ of the network.
As a consequence, by tuning the degree of dilution in the hidden Boltzmann network and the level
of noise in the directed network, the system exhibits a very rich phase diagram, whose
investigation is the subject of the present work.

The paper is organized as follows. In section 2 we review the multitasking networks introduced
in [14], highlighting their main features and providing a rigorous solution for their
thermodynamics through a novel technique based on mapping the statistical-mechanical problem
into a diffusion problem and then solving the latter through standard partial differential
equation methods. In section 3 the solutions obtained in the previous section are investigated;
in particular, we discuss the emergence of spurious states for these multitasking networks.
Then, in section 4, we describe the analytical technique used to study the stability of the
retrieval states, which are found to be solutions of the system. Exact analytical investigations
and numerical results are presented in section 5, together with a very rich phase diagram in
which different emergent behaviors in the organization of the neural states are shown. Finally,
section 6 is devoted to a summary and a discussion of the results, which are successfully
checked against Monte Carlo simulations.



2 The multitasking associative network 

In the conventional Hopfield model (see, e.g., [HIH]), one considers a network of 
$N$ neurons, where each neuron $\sigma_i$ can take two states, namely $\sigma_i = +1$ (firing)
and $\sigma_i = -1$ (quiescent). Neuronal states are given by the set of variables
$\sigma = (\sigma_1, ..., \sigma_N)$. Each neuron is located on a node of a complete graph and the
synaptic connection between two arbitrary neurons, say $\sigma_i$ and $\sigma_j$, is defined
by the following Hebb rule [1]:

$$ J_{ij} = \frac{1}{N} \sum_{\mu=1}^{P} \xi_i^\mu \xi_j^\mu, \tag{1} $$

where $\xi^\mu = (\xi_1^\mu, ..., \xi_N^\mu)$ denotes the set of memorized patterns, each
specified by a label $\mu = 1, ..., P$. The entries are dichotomic, i.e., $\xi_i^\mu \in \{+1, -1\}$,
chosen randomly and independently with equal probability, namely, for any $i$ and $\mu$,

$$ P(\xi_i^\mu) = \frac{1}{2}\left(\delta_{\xi_i^\mu - 1} + \delta_{\xi_i^\mu + 1}\right), \tag{2} $$

where the Kronecker delta $\delta_x$ equals 1 iff $x = 0$, otherwise it is zero. Patterns
are assumed to be quenched, that is, the performance of the network is analyzed
keeping the synaptic values fixed.

The Hamiltonian describing this system is

$$ H_N(\sigma, \xi) = -\sum_{i<j}^{N} J_{ij}\,\sigma_i \sigma_j = -\frac{1}{N} \sum_{i<j}^{N} \sum_{\mu=1}^{P} \xi_i^\mu \xi_j^\mu \sigma_i \sigma_j, \tag{3} $$

so that the signal (i.e., the field) acting on neuron $i$ is

$$ h_i(\sigma, \xi) = \sum_{j \neq i}^{N} J_{ij}\,\sigma_j. \tag{4} $$

The evolution of the system is ruled by a stochastic dynamics, according to
which the probability that the activity of a neuron $i$ assumes the value $\sigma_i$ is

$$ P(\sigma_i; \sigma, \xi, \beta) = \frac{1}{2}\left[1 + \tanh(\beta h_i \sigma_i)\right], \tag{5} $$

where $\beta$ tunes the level of noise such that for $\beta \to 0$ the system behaves completely
randomly, while for $\beta \to \infty$ it becomes noiseless and deterministic; notice
that the noiseless limit of Eq. (5) is $\sigma_i(t+1) = \mathrm{sign}[h_i(t)]$.

The main feature of the model described by Eqs. (3) and (5) is its ability
to work as an associative memory. More precisely, the patterns are said to
be memorized if each of the network configurations $\sigma_i = \xi_i^\mu$ for $i = 1, ..., N$,
for every one of the $P$ patterns labeled by $\mu$, is a fixed point of the dynamics.
Introducing the overlap $m_\mu$ between the state of neurons $\sigma$ and one of the
patterns $\xi^\mu$, as

$$ m_\mu = \frac{1}{N}\,\xi^\mu \cdot \sigma = \frac{1}{N} \sum_{i=1}^{N} \xi_i^\mu \sigma_i, \tag{6} $$

such a pattern is said to be retrieved if, in the thermodynamic limit, $m_\mu = O(1)$.
Given the definition (6), the Hamiltonian (3) can also be written as

$$ H_N(\sigma, \xi) = -N \sum_{\mu=1}^{P} (m_\mu)^2 + P = -N \mathbf{m}^2 + P. \tag{7} $$



The analytical investigation of the system is usually carried out in the thermodynamic
limit $N \to \infty$, consistently with the fact that real networks comprise a very large
number of neurons. Dealing with this limit, it is convenient to specify the relative number
of stored patterns, namely $P/N$, and to define the ratio $\alpha = \lim_{N\to\infty} P/N$.
The case $\alpha = 0$, corresponding to a number $P$ of stored patterns scaling sub-linearly
with respect to the number of performing neurons $N$, is often referred to as "low storage".
Conversely, the case of finite $\alpha$ is often referred to as "high storage". In particular,
in the former case ($\alpha = 0$), the overall behavior of the standard Hopfield model is ruled
only by the noise $T = 1/\beta$, and the so-called pure-state ansatz

$$ \mathbf{m} = (m, 0, ..., 0), \tag{8} $$

always corresponds to a stable solution for $T < 1$; the order in the entries
is purely conventional and here we assume that the first pattern is the one
stimulated.

Let us now move on and generalize the system described above in order to
account for the existence of blank entries in the patterns $\xi$. More precisely, we
replace Eq. (2) by

$$ P(\xi_i^\mu) = \frac{1-d}{2}\,\delta_{\xi_i^\mu - 1} + \frac{1-d}{2}\,\delta_{\xi_i^\mu + 1} + d\,\delta_{\xi_i^\mu}, \tag{9} $$

where $d$ encodes the degree of "dilution" in pattern entries. Patterns are still
assumed to be quenched and, of course, the definitions of the Hamiltonian (3) and
of the overlaps (6), with the dynamics provided by (5), still hold.

As discussed in [U [121 HZJj this kind of extension has strong biological 
motivations and also yields highly non-trivial thermodynamic outcomes. In
fact, the distribution in Eq. (2) necessarily implies that the retrieval of a unique
pattern employs all the available neurons, so that no resources are left for
further tasks. Conversely, with Eq. (9) the retrieval of one pattern still leaves
some neurons available (i.e., those corresponding to the blank entries of the retrieved
pattern), which can be used to recall other patterns up to the exhaustion of
all neurons. The resulting network is therefore able to process several patterns
simultaneously.
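
The mechanism just described can be illustrated with a small zero-noise simulation. The following is only a minimal sketch, not the procedure used in the paper: it assumes P = 2, a synchronous update with the mean-field field $h_i = \sum_\mu \xi_i^\mu m_\mu$ (the O(1/N) self-interaction is not removed), and an initial condition obtained by corrupting the hierarchical state; all function names are hypothetical.

```python
import random

def sample_patterns(N, P, d, rng):
    """Draw P diluted patterns: each entry is 0 with prob. d, else +1/-1 with equal prob."""
    def entry():
        if rng.random() < d:
            return 0
        return 1 if rng.random() < 0.5 else -1
    return [[entry() for _ in range(N)] for _ in range(P)]

def overlaps(sigma, xi):
    """Mattis overlaps m_mu = (1/N) sum_i xi_i^mu sigma_i."""
    N = len(sigma)
    return [sum(x * s for x, s in zip(pattern, sigma)) / N for pattern in xi]

def zero_noise_sweep(sigma, xi):
    """One synchronous update sigma_i -> sign(sum_mu xi_i^mu m_mu).
    Spins with zero field (all entries blank) keep their current value."""
    m = overlaps(sigma, xi)
    new_sigma = []
    for i, s in enumerate(sigma):
        h = sum(xi[mu][i] * m[mu] for mu in range(len(xi)))
        new_sigma.append(s if h == 0 else (1 if h > 0 else -1))
    return new_sigma

def hierarchical_state(xi, rng):
    """Follow the first non-blank pattern entry; pick a random spin if all are blank."""
    N = len(xi[0])
    sigma = []
    for i in range(N):
        value = next((p[i] for p in xi if p[i] != 0), 0)
        sigma.append(value if value != 0 else rng.choice((-1, 1)))
    return sigma

rng = random.Random(0)
N, P, d = 10000, 2, 0.3
xi = sample_patterns(N, P, d, rng)
# Corrupt 20% of the spins of the hierarchical state, then let the dynamics relax.
sigma = [(-s if rng.random() < 0.2 else s) for s in hierarchical_state(xi, rng)]
for _ in range(5):
    sigma = zero_noise_sweep(sigma, xi)
m = overlaps(sigma, xi)
print(m)  # expected to be close to (1 - d) * (1, d) = (0.7, 0.21), up to finite-size noise
```

Both overlaps stay macroscopic at the same time: the second pattern is retrieved on the spins left free by the blank entries of the first one.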

In particular, in the low-storage regime, it was shown both analytically (via
density-of-states analysis) and numerically (via Monte Carlo simulations) [14] [11]
that the system evolves toward an equilibrium state where several patterns
are simultaneously retrieved. In the noiseless limit $T = 0$ and for $d$ not too
large, the equilibrium state is characterized by a hierarchical overlap

$$ \mathbf{m} = (1-d)\,(1, d, d^2, ..., 0), \tag{10} $$

hereafter referred to as "parallel ansatz". On the other hand, in the presence
of noise or for large degrees of dilution in pattern entries, this state ceases
to be a stable solution for the system and different states, possibly spurious,
emerge. The aim of this work is to highlight the equilibrium states of this system
as a function of the parameters $d$ and $T$, and finally to build a phase diagram. To
this end we first develop a rigorous mathematical treatment for calculating
the free energy of the model and obtain the self-consistencies constraining
the phase diagram; then, we solve these equations both numerically and with
a stability analysis. In this way we are able to draw the phase diagram, whose
peculiarities lie in the stability of both even and odd mixtures of spurious states
(in proper regions of the parameters) and in the formation of parallel spurious
states. Both these results generalize the standard counterpart of classical
Hopfield networks.

Findings are double-checked through Monte Carlo runs, which are in excellent
agreement with the picture we obtained.

2.1 Statistical mechanics analysis through Fourier technique

We solve the general model described by the Hamiltonian (3), with patterns
diluted according to (9), in the low-storage regime $P \sim \log N$, such that the
limit $\alpha = \lim_{N\to\infty} P/N = 0$ holds^1. Due to the formal analogy with
statistical-mechanics models for magnetic systems [1], in the following neurons will
also be referred to as spins.

As standard in disordered statistical mechanics, we introduce three types
of average for a generic observable $o(\sigma, \xi)$: i. the Boltzmann average
$\omega(o) = \sum_{\sigma} o(\sigma, \xi) \exp[-\beta H(\sigma; \xi)] / Z_{N,P}(\beta, d)$, where

$$ Z_{N,P}(\beta, d) = \sum_{\sigma} \exp[-\beta H(\sigma, \xi)] $$

is called "partition function"; ii. the average $\mathbb{E}$ performed over the quenched
disordered couplings $\xi$; iii. the global expectation $\mathbb{E}\omega(o)$, defined by the
brackets $\langle o \rangle_\xi$.

Given these definitions, for the average energy of the system E we can write 

Also, we are interested in finding an explicit expression for the order parameters
of the model, namely the $P$ averaged Mattis magnetizations

$$ \langle m_\mu \rangle_\xi = \lim_{N\to\infty} \mathbb{E}\,\omega\Big(\frac{1}{N}\sum_{i=1}^{N} \xi_i^\mu \sigma_i\Big). \tag{11} $$

To this end we need to introduce the statistical pressure

$$ \alpha(\beta, d) = \lim_{N\to\infty} \frac{1}{N} \ln Z_{N,P}(\beta, d), $$

which is immediately related to the free energy per site $f(\beta, d)$ by the relation
$f(\beta, d) = -\alpha(\beta, d)/\beta$; by maximizing $\alpha(\beta, d)$ with respect to the $P$
magnetizations $\langle m_\mu \rangle$, we get exactly the self-consistency equations for these
order parameters, whose solutions will give us a picture of the phase diagram.
In the past decades, scientists involved in disordered statistical mechanics
investigations, even beyond Artificial Intelligence, opened several routes for
solving this kind of problem, and nowadays a plethora of techniques is available.
We extend early ideas of Guerra [11], on the line developed in [5D], consisting 
in modeling disordered statistical mechanics through dynamical system theory 
and in particular, here, we are going to proceed as follows: 



^1 Results outlined within this scaling can be extended with little effort to the whole region
$P \sim N^\gamma$, with $\gamma < 1$, such that the constraint $\alpha = 0$ is preserved, as
realized in the Willshaw model [18] concerning neural sparse coding.



Our statistical-mechanics problem is mapped into a diffusive problem embedded
in a $P$-dimensional space and with given, known boundary conditions. We solve the
diffusive problem via the standard Green-propagator technique, and then we
map the obtained solutions back to their original statistical-mechanics
meaning.

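To this end, let us introduce a generalized Boltzmann factor
$B_N(\mathbf{x}, t)$ depending on $P + 1$ parameters $\mathbf{x}, t$ (which we think of as
generalized $P$-dimensional Euclidean space and time),

$$ B_N(\mathbf{x}, t; \xi, \sigma) = \exp\Big( \frac{t}{N} \sum_{i<j}^{N} \sigma_i \sigma_j \sum_{\mu=1}^{P} \xi_i^\mu \xi_j^\mu + \sum_{\mu=1}^{P} x_\mu \sum_{j=1}^{N} \xi_j^\mu \sigma_j \Big), \tag{12} $$

and the generalized statistical pressure

$$ \alpha_N(\mathbf{x}, t) = \frac{1}{N} \ln \sum_{\{\sigma\}} B_N(\mathbf{x}, t; \xi, \sigma). \tag{13} $$

Notice that, for proper values of $\mathbf{x}, t$, namely $\mathbf{x} = 0$ and $t = \beta$,
classical statistical mechanics is recovered as

$$ \alpha(\beta) = \lim_{N\to\infty} \alpha_N(\mathbf{x} = 0, t = \beta) = \lim_{N\to\infty} \frac{1}{N} \ln \sum_{\{\sigma\}} \exp[-\beta H_N(\sigma, \xi)]. \tag{14} $$

In the same way, the average $\langle \cdot \rangle_{(\mathbf{x},t)}$ will be denoted by
$\langle \cdot \rangle$ wherever evaluated in the sense of statistical mechanics, namely

$$ \langle o \rangle_{(\mathbf{x},t)} = \frac{\sum_{\{\sigma\}} o(\sigma, \xi)\, B_N(\mathbf{x}, t; \xi, \sigma)}{\sum_{\{\sigma\}} B_N(\mathbf{x}, t; \xi, \sigma)}, \qquad
\langle o \rangle = \frac{\sum_{\{\sigma\}} o(\sigma, \xi) \exp[-\beta H(\sigma, \xi)]}{\sum_{\{\sigma\}} \exp[-\beta H(\sigma, \xi)]} = \langle o \rangle_{(\mathbf{x}=0,\, t=\beta)}. \tag{15} $$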

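It is immediate to see that the following equations hold:

$$ \partial_t \alpha_N(\mathbf{x}, t) = \frac{1}{2} \sum_{\mu=1}^{P} \langle m_\mu^2 \rangle_{(\mathbf{x},t)}, \tag{16} $$

$$ \partial_{x_\mu} \alpha_N(\mathbf{x}, t) = \langle m_\mu \rangle_{(\mathbf{x},t)}, \tag{17} $$

and, defining a vector $\Gamma_N(\mathbf{x}, t)$ of elements
$\Gamma_N^\mu(\mathbf{x}, t) = -\partial_{x_\mu} \alpha_N(\mathbf{x}, t)$, by construction
$\Gamma_N^\mu(\mathbf{x}, t)$ obeys the following equation:

$$ \partial_t \Gamma_N^\mu(\mathbf{x}, t) + \sum_{\nu=1}^{P} \Gamma_N^\nu(\mathbf{x}, t)\,[\partial_{x_\nu} \Gamma_N^\mu(\mathbf{x}, t)] = \frac{1}{2N} \sum_{\nu=1}^{P} \partial^2_{x_\nu} \Gamma_N^\mu(\mathbf{x}, t), \tag{18} $$

which happens to be in the form of a Burgers' equation for the vector $\Gamma_N(\mathbf{x}, t)$
with a kinematic viscosity $(2N)^{-1}$. As is well known, the Burgers' equation
can be mapped into a $P$-dimensional diffusive problem using the Cole-Hopf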
transformation |^ as follow: 



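$$ \psi_N(\mathbf{x}, t) = \exp\Big[ -N \int \sum_{\mu=1}^{P} dx_\mu\, \Gamma_N^\mu(\mathbf{x}, t) \Big] = \exp[N \alpha_N(\mathbf{x}, t)], \tag{19} $$

and its $t$ and $\mathbf{x}$ streaming read off as

$$ \partial_t \psi_N(\mathbf{x}, t) = N \psi_N(\mathbf{x}, t)\, \partial_t \alpha_N(\mathbf{x}, t), \qquad
\partial_{x_\mu} \psi_N(\mathbf{x}, t) = N \psi_N(\mathbf{x}, t)\, \partial_{x_\mu} \alpha_N(\mathbf{x}, t), \tag{20} $$

in such a way that

$$ \partial^2_{x_\mu} \psi_N(\mathbf{x}, t) = N \psi_N(\mathbf{x}, t) \big\{ \partial^2_{x_\mu} \alpha_N(\mathbf{x}, t) + N [\partial_{x_\mu} \alpha_N(\mathbf{x}, t)][\partial_{x_\mu} \alpha_N(\mathbf{x}, t)] \big\}. \tag{21} $$

Now, from equations (20), (21) we get

$$ \partial_t \psi_N(\mathbf{x}, t) - \frac{1}{2N} \sum_{\mu=1}^{P} \partial^2_{x_\mu} \psi_N(\mathbf{x}, t) = 0. \tag{22} $$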



Therefore, we established a reformulation of the problem of calculating the
thermodynamic potential $\alpha(\beta, d)$ over the equilibrium configuration of the
order parameters for an attractor network model in terms of a diffusion equation
for the function $\psi_N(\mathbf{x}, t)$, namely the Cole-Hopf transform of the Mattis
magnetizations, with a diffusion coefficient $D = (2N)^{-1}$, that is

$$ \partial_t \psi_N(\mathbf{x}, t) - D \nabla^2 \psi_N(\mathbf{x}, t) = 0, \qquad
\psi_N(\mathbf{x}, 0) = \sum_{\{\sigma\}} \exp\Big( \sum_{\mu} x_\mu \sum_{i} \xi_i^\mu \sigma_i \Big). \tag{23} $$

We solve the Cauchy problem (23) through standard techniques: first, we map
the diffusive equation into Fourier space, then we calculate the Green propagator
for the homogeneous configuration, and finally we inverse-transform
the solution.
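Let us consider the Fourier transform:

$$ \hat\psi_N(\mathbf{k}, t) = \int_{\mathbb{R}^P} d^P x\, \exp\Big( -i \sum_{\mu=1}^{P} x_\mu k_\mu \Big) \psi_N(\mathbf{x}, t), \qquad
\psi_N(\mathbf{x}, t) = \frac{1}{(2\pi)^P} \int_{\mathbb{R}^P} d^P k\, \exp\Big( i \sum_{\mu=1}^{P} x_\mu k_\mu \Big) \hat\psi_N(\mathbf{k}, t), \tag{24} $$

and the related Green problem:

$$ \partial_t \hat G(\mathbf{k}, t) + D k^2 \hat G(\mathbf{k}, t) = \delta(t), \tag{25} $$

where $\hat G(\mathbf{k}, t)$ is the Green propagator in $k$-space, which can be decomposed as

$$ \hat G(\mathbf{k}, t) = \hat G_R(\mathbf{k}, t) + \hat G_S(\mathbf{k}, t), \tag{26} $$

$\hat G_R(\mathbf{k}, t)$ being the general solution of the homogeneous problem and
$\hat G_S(\mathbf{k}, t)$ a particular solution of the non-homogeneous problem. Hence, the full
solution will be

$$ \psi_N(\mathbf{x}, t) = \int d^P x'\, G_R(\mathbf{x} - \mathbf{x}', t)\, \psi_N(\mathbf{x}', 0), \tag{27} $$

where the function $\hat G_R(\mathbf{k}, t)$ fulfills

$$ \partial_t \hat G_R(\mathbf{k}, t) + D k^2 \hat G_R(\mathbf{k}, t) = 0, \qquad \hat G_R(\mathbf{k}, 0) = 1, \tag{28} $$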

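hence

$$ \hat G_R(\mathbf{k}, t) = \exp(-D k^2 t), \tag{29} $$

$$ G_R(\mathbf{x}, t) = \frac{1}{(2\sqrt{\pi D t})^{P}} \exp\Big( -\frac{\mathbf{x}^2}{4 D t} \Big). \tag{30} $$

Therefore, we get

$$ \psi_N(\mathbf{x}, t) = \Big( \frac{N}{2\pi t} \Big)^{P/2} \int \Big( \prod_{\mu=1}^{P} dx'_\mu \Big) \exp[-N \Phi(\mathbf{x}', \mathbf{x}, t)], \tag{31} $$

$$ \Phi(\mathbf{x}', \mathbf{x}, t) = \frac{(\mathbf{x} - \mathbf{x}')^2}{2t} - \ln 2 - \frac{1}{N} \sum_{i=1}^{N} \ln \cosh\Big( \sum_{\mu=1}^{P} x'_\mu \xi_i^\mu \Big), \tag{32} $$

and

$$ \alpha_N(\mathbf{x}, t) = \frac{1}{N} \ln[\psi_N(\mathbf{x}, t)]. $$

We can now solve the saddle-point equation

$$ \alpha(\mathbf{x}, t) = \lim_{N\to\infty} \alpha_N(\mathbf{x}, t) = \mathrm{Extr}_{\mathbf{x}'} \{ -\Phi(\mathbf{x}', \mathbf{x}, t) \}, \tag{33} $$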
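where we neglected $O(N^{-1})$ terms, as we performed the thermodynamic limit.
Finally, by replacing $t = \beta$, $\mathbf{x} = 0$ and $x'_\mu = \beta \langle m_\mu \rangle$
(hence recovering the original statistical mechanics framework), we obtain the following
expression for the statistical pressure

$$ \alpha(\beta) = -\frac{\beta}{2} \sum_{\mu=1}^{P} \langle m_\mu \rangle^2 + \ln 2 + \Big\langle \ln \cosh\Big( \beta \sum_{\mu=1}^{P} \langle m_\mu \rangle \xi^\mu \Big) \Big\rangle_\xi, \tag{34} $$

whose extremization immediately yields the $P$ desired self-consistency equations
for all the $\langle m_\mu \rangle$,

$$ \langle m_\mu \rangle = \Big\langle \xi^\mu \tanh\Big( \beta \sum_{\nu=1}^{P} \xi^\nu \langle m_\nu \rangle \Big) \Big\rangle_\xi \quad \forall \mu \in [1, P], \tag{35} $$

where with the index $\xi$ we emphasized once more that the disorder average over
the quenched patterns is performed as well.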
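
For small P the disorder average in Eq. (35) can be carried out exactly by enumerating the $3^P$ possible entry tuples, so the self-consistency equations can be solved by plain fixed-point iteration. The code below is a minimal sketch under that assumption; the function names, the starting point and the iteration count are illustrative choices, not taken from the paper, and the iteration only reaches solutions that are attractive for this particular map.

```python
from itertools import product
from math import tanh

def pattern_weight(xi, d):
    """Probability of one P-tuple of entries under the diluted distribution, Eq. (9)."""
    w = 1.0
    for x in xi:
        w *= d if x == 0 else (1.0 - d) / 2.0
    return w

def self_consistency_map(m, beta, d):
    """Right-hand side of Eq. (35): <xi^mu tanh(beta * sum_nu xi^nu m_nu)>_xi."""
    P = len(m)
    out = [0.0] * P
    for xi in product((-1, 0, 1), repeat=P):
        t = pattern_weight(xi, d) * tanh(beta * sum(x * mv for x, mv in zip(xi, m)))
        for mu in range(P):
            out[mu] += xi[mu] * t
    return out

def solve(m0, beta, d, iterations=2000):
    """Iterate m -> F(m) starting from the guess m0."""
    m = list(m0)
    for _ in range(iterations):
        m = self_consistency_map(m, beta, d)
    return m

# Pure-state guess for P = 3 at T = 0.5 and d = 0.2.
m_pure = solve([1.0, 0.0, 0.0], beta=2.0, d=0.2)
print(m_pure)
```

Starting from other guesses, for instance (1, 1, 1) or $(1-d)(1, d, d^2)$, selects symmetric or parallel solutions wherever they exist and are attractive for the iteration.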



Of course, the self-consistency equations (35) recover those obtained in [14] [11]
via different analytical techniques, where they were also shown to lead to
the parallel ansatz (10), which, in turn, can be formally written as

$$ \sigma_i = \xi_i^1 + \sum_{\nu=2}^{P} \xi_i^\nu \prod_{\mu=1}^{\nu-1} \delta(\xi_i^\mu), \tag{36} $$

and it will be referred to as $\sigma^{(P)}$.



The parallel ansatz (10) can be understood rather intuitively. To fix ideas,
let us assume zero noise level and that one pattern, say $\mu = 1$, is perfectly
retrieved. This means that the related average magnetization is $m_1 = (1 - d)$,
while a fraction $d$ of spins is still available and they can arrange themselves to
retrieve a further pattern, say $\mu = 2$. Again, not all of them can match non-null
entries in pattern $\xi^2$, and the related average magnetization is $m_2 = d(1 - d)$.
Proceeding in the same way for all spins, we get the parallel state. Notice that the number
$K$ of patterns which are, at least partially, retrieved does not necessarily equal
$P$. In fact, due to discreteness, it must be $d^{K-1}(1 - d) \geq 1/N$, namely at least
one spin must be aligned with $\xi^K$, and this implies $K \lesssim \log N$.
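
The hierarchy of overlaps and the discreteness bound on K can be evaluated directly. The sketch below simply implements $m_k = d^{k-1}(1-d)$ and counts how many patterns satisfy $d^{K-1}(1-d) \geq 1/N$; the helper names are illustrative.

```python
def parallel_overlaps(d, P):
    """Zero-noise parallel ansatz: m_k = (1 - d) * d**(k - 1) for k = 1..P."""
    return [(1.0 - d) * d ** k for k in range(P)]

def max_retrieved_patterns(d, N):
    """Largest K such that d**(K - 1) * (1 - d) >= 1/N (at least one aligned spin)."""
    K = 0
    while (1.0 - d) * d ** K >= 1.0 / N:
        K += 1
    return K

print(parallel_overlaps(0.3, 4))
print(max_retrieved_patterns(0.5, 1024))
```

The overlaps sum to $1 - d^P$, i.e. all spins except those whose entries are blank in every pattern are employed, and K grows only logarithmically with N.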

Such a hierarchical, parallel fashion of alignment, providing an overall energy
(see Eq. (7))

$$ E^{(P)} = -N \sum_{k=1}^{P} [(1 - d) d^{k-1}]^2 + P = -N \frac{(1 - d)(1 - d^{2P})}{1 + d} + P, \tag{37} $$

is more favorable than a uniform alignment of spins amongst the available patterns,
as this case would yield $m_k = (1 - d)/P$ for any $k$ and an overall energy

$$ E^{(U)} = -N \sum_{k=1}^{P} \Big( \frac{1 - d}{P} \Big)^2 + P = -N \frac{(1 - d)^2}{P} + P, \tag{38} $$

being $(1 - d^{2P})/(1 + d) > (1 - d)/P$.

On the other hand, as we will see in Sec. 3.1, when $d > d_c \approx 1/2$ the state (10)
is no longer stable and spurious states do emerge.

Before proceeding, it is worth stressing that, although the parallel state (10)
displays non-zero overlap with several patterns, it is deeply different from, and must
not be confused with, a spurious state in standard Hopfield networks. In fact, in
the former case, at least one pattern is completely retrieved, while in spurious
states the overlap with each memory pattern involved is only partial.
Moreover, in standard Hopfield networks, spurious states are somehow undesirable
because they provide corrupted information with respect to the best
retrieval achievable, where one, and only one, pattern is exactly retrieved. Conversely,
in our model, the retrieval of more than one pattern is unavoidable (for
finite $d$ and $\beta \to \infty$) and the quality of retrieval may be excellent (perfect) in
the case of patterns poorly (not) overlapping.

Finally, and most importantly, for $\beta \to \infty$ and in a wide region of dilution, the
parallel state $\sigma^{(P)}$ corresponds to a global minimum of the energy. This is not
the case for an arbitrary mixture of states.

3 The emergence of spurious states 



In Sec. 2.1 we explained why we expect the parallel state (36) to occur, exploiting
the fact that each pattern tends to align as many spins as possible among those
still available. Actually, this intuitive approach yields the correct picture for
$T = 0$ (no fast noise) and not-too-large $d$, while when either $T$ or the degree
of dilution is large enough, the system can relax to a state where only one
pattern is retrieved, or fall into a spurious state where several patterns are
partially retrieved, but none exactly. These states are discussed in the following
subsections, and in Sec. 4 the analysis will be made quantitative.

3.1 The failure of parallel retrieval 



Let us start from the noiseless case and consider the state (36) corresponding
to the parallel ansatz (10): we notice that, on average, there exists a fraction
$2[(1 - d)/2]^P$ of spins $\sigma_i$ corresponding to the entries
$\xi_i^1 = 1, \xi_i^k = -1, \forall k \in [2, P]$ (and analogously for the "gauged" case
$\xi_i^1 = -1, \xi_i^k = +1$) and expected to be aligned with the first entry $\xi_i^1$,
in such a way that the overall field acting on each of them is
$h_i = m_1 - m_2 - m_3 - ... - m_P$. Of course, such spins are the most
unstable and, at zero noise level, they flip whenever $h_i$ happens to be negative,
that is, when $m_1 < \sum_{k=2}^{P} m_k$. Exploiting the ansatz $m_k = d^{k-1}(1 - d)$, the
field can be written as

$$ h_i = (1 - d) \Big[ 1 - \frac{d - d^{P}}{1 - d} \Big] = 1 - 2d + d^{P}, \tag{39} $$

which becomes negative for a value of dilution $d_c(P)$, which converges exponentially
from above to 1/2 as $P$ gets large. From this point onwards, the
first pattern is no longer completely retrieved and the system fails to parallel
retrieve (according to the definition in Eq. (36)). Therefore, when $d > d_c(P)$, genuine
spurious states emerge and the system relaxes to states which correspond
to mixtures of $p \leq P$ patterns, but none of them is completely retrieved (at least
up to extreme values of dilution). As we will see in Sec. 4.4, the transition at
$d_c(P)$ is first order.

Moreover, from Eq. (39) we find that the case $P = 2$ has no solution for $0 < d < 1$,
meaning that the parallel-retrieval state is always a stable
solution in the zero-noise limit; on the other hand, $d_c(3) \approx 0.62$, $d_c(4) \approx 0.54$
and so on.
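
The values of $d_c(P)$ quoted above can be reproduced by locating the root of Eq. (39) in the interior of the interval (the trivial root d = 1 is excluded). The following sketch uses bisection on [0.5, 0.9], where $1 - 2d + d^P$ changes sign for every $P \geq 3$; the bracket and the function names are illustrative choices.

```python
def flip_margin(d, P):
    """m_1 - sum_{k>=2} m_k for the parallel ansatz, i.e. 1 - 2d + d**P (Eq. 39)."""
    return 1.0 - 2.0 * d + d ** P

def critical_dilution(P, lo=0.5, hi=0.9, tol=1e-12):
    """Bisection for the non-trivial root of 1 - 2d + d**P; requires P >= 3."""
    if P < 3:
        raise ValueError("for P < 3 the margin never becomes negative in (0, 1)")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if flip_margin(mid, P) > 0.0:
            lo = mid  # parallel state still stable at mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for P in (3, 4, 5, 10):
    print(P, critical_dilution(P))
```

For P = 3 the root is the inverse golden ratio $(\sqrt{5} - 1)/2 \approx 0.618$, and the values approach 1/2 from above as P grows.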

Such phenomenology concerns relatively large degrees of dilution; yet, the
presence of noise can also destabilize the true parallel-retrieval state (10) in the
regime of small degrees of dilution. In fact, we expect that the spins aligned
according to the $k$-th pattern, associated with a magnetization $m_k = d^{k-1}(1 - d)$,
will lose stability at noise levels $T > d^{k-1}(1 - d)$. In particular, at $T > d(1 - d)$,
only one pattern will be retrieved and the pure state is somehow recovered. As
we will see in Sec. 4.4, such estimates are correct for small $d$.



3.2 Symmetric mixtures 

Typical spurious states emerging in standard associative networks are the so-called
symmetric mixtures of $p \leq P$ states, which can be described as

$$ \sigma_i = \mathrm{sign}\Big( \sum_{\mu=1}^{p} \xi_i^\mu \Big), \tag{40} $$

and it will be referred to as $\sigma^{(S)}$. We anticipate that the symmetric mixture
turns out to emerge also in the diluted model under investigation.
Now, in the standard Hopfield model, odd mixtures of $p$ patterns are metastable,
i.e., their energies are higher than those of the pure patterns and, moreover,
the smaller $p$, the more energetically favorable the mixture. On the other
hand, even mixtures of $p$ patterns are unstable (they are saddle points of the
energy). The instability of even mixtures is often associated with the fact that, for
a macroscopic fraction of spins, $\sigma_i^{(S)}$ is not defined due to the ambiguity of the
sign. For instance, when $p = 2$, $\sum_{\mu=1}^{2} \xi_i^\mu$ happens to be null for half of
the spins, and the related values are defined stochastically according to the distribution

$$ P(\sigma_i) = \frac{1}{2} (\delta_{\sigma_i - 1} + \delta_{\sigma_i + 1}). \tag{41} $$

However, as we will show in Sec. 4.3, this is not the case for this diluted model,
as it displays wide regions in the parameter space $(d, T)$ where even and/or odd
symmetric mixtures are stable.

3.3 A "hybrid" spurious state 



As we will see in Sec. 4.3, the symmetric mixture $\sigma^{(S)}$ can become unstable
and relax to a different spurious state, which is a "hybrid" state between the
symmetric mixture $\sigma^{(S)}$ and the parallel state $\sigma^{(P)}$.

To fix ideas, let us set $P = 3$ and start from the state
$\sigma_i = \mathrm{sign}(\xi_i^1 + \xi_i^2 + \xi_i^3)$. In the presence of dilution the argument
$\xi_i^1 + \xi_i^2 + \xi_i^3$ can be zero, and in that situation one can adopt the following
hierarchical rule: take $\sigma_i = \xi_i^1$ provided that $\xi_i^1 \neq 0$; otherwise, if
$\xi_i^1 = 0$, take $\sigma_i = \xi_i^2$ provided that $\xi_i^2 \neq 0$; otherwise, if also
$\xi_i^2 = 0$, take $\sigma_i = \xi_i^3$ provided that $\xi_i^3 \neq 0$; otherwise, if also
$\xi_i^3 = 0$, put $\sigma_i = \pm 1$ with probability 1/2. In this way
we can build a state, generally defined for any $P$, which, setting $S = \sum_\mu \xi_i^\mu$,
can be written as

$$ \sigma_i = (1 - \delta_{S,0})\, \mathrm{sign}(S) + \delta_{S,0} \big[ \xi_i^1 + \delta_{\xi_i^1, 0}\, \xi_i^2 + \delta_{\xi_i^1, 0}\, \delta_{\xi_i^2, 0}\, \xi_i^3 + ... \big], \tag{42} $$

which will be referred to as $\sigma^{(H)}$.

The related average Mattis magnetizations can be calculated as the sum of
one contribution $m_0$ (the same for any $\mu$), deriving from the spins corresponding
to a non-ambiguous sign function (i.e., $S \neq 0$), and another contribution
accounting for hierarchical corrections (i.e., $S = 0$). Let us focus on the first
term:

$$ m_0 = \langle \xi^\mu\, \mathrm{sign}(S) \rangle_\xi = \frac{1 - d}{2} \Big\langle \mathrm{sign}\Big( 1 + \sum_{\nu \neq \mu}^{P} \xi^\nu \Big) - \mathrm{sign}\Big( -1 + \sum_{\nu \neq \mu}^{P} \xi^\nu \Big) \Big\rangle_\xi, \tag{43} $$

where, in the last step, we exploited the implicit symmetry in pattern entries
and $P(|\sum_{\nu \neq \mu} \xi^\nu| \leq 1)$ represents the probability that the specified
inequality is verified over the distribution (9). The latter quantity can also be looked at as
the probability for a symmetric random walk with holding probability $d$ to be
at distance $\leq 1$ from its origin after a time span $P - 1$. Hence, we get

$$ m_0 = (1 - d) [P(0 \to 0, P - 1) + P(0 \to 1, P - 1)], \tag{45} $$

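where $P(x_0 \to x, t)$ is the probability for a symmetric random walk with stopping
probability $d$ to move from site $x_0$ to site $x$ in $t$ steps, namely

$$ P(x_0 \to x, t) = \sum_{s=0}^{t - |x - x_0|} \frac{t!}{s!\, \big( \frac{t - s - (x - x_0)}{2} \big)!\, \big( \frac{t - s + (x - x_0)}{2} \big)!}\; d^{s} \Big( \frac{1 - d}{2} \Big)^{t - s}, \tag{46} $$

where the sum runs over the numbers $s$ of stops for which $t - s - (x - x_0)$ is even.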



The second contribution to the magnetization is (1 — d) J2k=i "^(^ ~^ ^^P' 
Finally, by summing the two contributions we find the following expressions
for $P = 3$

$$ m_1 = \frac{1}{2} (1 + d - 3d^2 + d^3), \tag{47} $$
$$ m_2 = \frac{1}{2} (1 - d)(1 + d^2), \tag{48} $$
$$ m_3 = \frac{1}{2} (1 - 3d + 5d^2 - 3d^3), \tag{49} $$

and for $P = 5$

$$ m_1 = \frac{1}{8} (3 + 9d - 42d^2 + 74d^3 - 65d^4 + 21d^5), \tag{50} $$
$$ m_2 = \frac{1}{8} (1 - d)(3 + 6d^2 - d^4), \tag{51} $$
$$ m_3 = \frac{1}{8} (1 - d)(3 - 4d + 18d^2 - 20d^3 + 11d^4), \tag{52} $$
$$ m_4 = \frac{1}{8} (1 - d)(3 - 4d + 18d^2 - 28d^3 + 19d^4), \tag{53} $$
$$ m_5 = \frac{1}{8} (1 - d)(3 - 4d + 18d^2 - 36d^3 + 27d^4). \tag{54} $$

The expressions for arbitrary $P$ can be analogously calculated exactly and some
examples are shown in Fig. 1.

Figure 1: Mattis magnetizations $m$ versus dilution $d$, according to the analytical
expressions derived in Sec. 3.3. Each panel refers to a different value of $P$, as
specified.
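
The closed forms (47)-(54) can be cross-checked by brute force: for small P one can enumerate all $3^P$ entry tuples, apply the hierarchical rule of Eq. (42) and accumulate $\langle \xi^\mu \sigma \rangle$. The sketch below does this with floating-point weights; the function name is illustrative, and tuples with all entries blank are skipped because they contribute zero to every overlap.

```python
from itertools import product

def hybrid_magnetizations(P, d):
    """Exact overlaps <xi^mu sigma> of the hybrid state, Eq. (42), by enumeration."""
    a = (1.0 - d) / 2.0
    m = [0.0] * P
    for xi in product((-1, 0, 1), repeat=P):
        weight = 1.0
        for x in xi:
            weight *= d if x == 0 else a
        S = sum(xi)
        if S != 0:
            sigma = 1 if S > 0 else -1
        else:
            # Hierarchical rule: follow the first non-blank entry (0 if all are blank).
            sigma = next((x for x in xi if x != 0), 0)
        for mu in range(P):
            m[mu] += weight * xi[mu] * sigma
    return m

d = 0.3
print(hybrid_magnetizations(3, d))
print(hybrid_magnetizations(5, d))
```

At d = 0 the routine returns the familiar symmetric-mixture overlaps, 1/2 for P = 3 and 3/8 for P = 5, for every pattern.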



We expect $\sigma^{(H)}$ to become globally stable in the region of very large dilution
($d > d_n(P)$); intuitively, dilution must be large enough to make the magnetizations
rather close to each other, in such a way that the least signalled spins, corresponding
to the entries $(-, -, ..., -, +, +, ..., +)$ (overall $(P - 1)/2$ negative entries and
(P + l)/2 positive entries) are stable. This means J2ii^ ^ fe,o)sign(S)^-'/iV > 
Eii'i'^^\k{P + l)/{P~k), where ipk = 2j:,[il-d)/2]^^dP-^'{P~ky./m- 
1)!(P − k − 2l + 1)!] and P is odd. This condition is fulfilled for values of dilution
larger than $d_n(P)$, which converges to 1 as $P$ gets larger; hence, in order to
tackle this limit, dilution must become a function of the system size, $d \to d(N)$.
In this case the network itself becomes diluted as well and different techniques
are required; this will not be discussed in this paper.

4 Stability analysis on the organization of the 
states 



The set of solutions of the self-consistency equations (35) describes states whose
stability may vary strongly. In fact, provided the network has reached them, in
the noiseless limit (of whatever kind) it would persist in those states. However,
the equations do not contain any information about whether the solutions are
stable against small perturbations, that is to say, whether the system will
really thermalize on these states or will fall apart more or less quickly. In order
to evaluate their stability we need to check the second derivative of the free
energy [1]. More precisely, we need to build up the so-called "stability
matrix" $A$ with elements

$$ A^{\mu\nu} = \frac{\partial^2 f(\beta, d)}{\partial m_\mu\, \partial m_\nu}. \tag{55} $$

Then, we evaluate and diagonalize $A$ at a point $\mathbf{m}$, representing a particular
solution of the self-consistency equations (35), in order to determine whether $\mathbf{m}$
is stable or not. Denoting by $\{E_\mu\}_{\mu = 1, ..., P}$ the set of related eigenvalues,
$\mathbf{m}$ is stable whenever all of them are positive.



Now, from Eqs. (34) and (55), remembering that $\alpha(\beta, d) = -\beta f(\beta, d)$, we find
straightforwardly

$$ A^{\mu\nu} = [1 - \beta(1 - d)]\, \delta^{\mu\nu} + \beta Q^{\mu\nu}, \tag{56} $$

where

$$ Q^{\mu\nu} = \langle \xi^\mu \xi^\nu \tanh^2(\beta\, \xi \cdot \mathbf{m}) \rangle_\xi. \tag{57} $$

Of course, when $d = 0$ we recover
$A^{\mu\nu} = (1 - \beta)\, \delta^{\mu\nu} + \beta \langle \xi^\mu \xi^\nu \tanh^2(\beta\, \xi \cdot \mathbf{m}) \rangle_\xi$,
namely the result known for the standard Hopfield model.

We now consider several states, known to be solutions of the self-consistency
equations (35), and check their stability. In this way we will find constraints on
the region of the $(T, d)$ plane where those states are stable, and then we will build up the
phase diagram.
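
For small P the matrix in Eqs. (56)-(57) can be evaluated exactly by enumerating the pattern entries. The sketch below is one way to do so (the function name and the sample parameters are illustrative); the values of $\beta$, $d$ and $m$ used here are arbitrary test inputs, and $m$ is not required to solve Eq. (35) for the matrix to be well defined.

```python
from itertools import product
from math import tanh

def stability_matrix(m, beta, d):
    """A^{mu nu} = [1 - beta(1-d)] delta^{mu nu} + beta <xi^mu xi^nu tanh^2(beta xi.m)>_xi."""
    P = len(m)
    A = [[(1.0 - beta * (1.0 - d)) if mu == nu else 0.0 for nu in range(P)]
         for mu in range(P)]
    for xi in product((-1, 0, 1), repeat=P):
        weight = 1.0
        for x in xi:
            weight *= d if x == 0 else (1.0 - d) / 2.0
        t2 = tanh(beta * sum(x * mv for x, mv in zip(xi, m))) ** 2
        for mu in range(P):
            for nu in range(P):
                A[mu][nu] += beta * weight * xi[mu] * xi[nu] * t2
    return A

beta, d, m = 4.0, 0.2, 0.6
A_pm = stability_matrix([0.0, 0.0, 0.0], beta, d)   # paramagnetic point
A_pure = stability_matrix([m, 0.0, 0.0], beta, d)   # pure-state direction
print(A_pm[0][0], A_pure[0][0], A_pure[1][1])
```

For these two families of points the matrix comes out diagonal, so its eigenvalues can be read off directly; for mixtures the off-diagonal entries are non-zero and a diagonalization step is needed.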

4.1 Paramagnetic state 

Let us start with the paramagnetic state, which is described by

$$ m_\mu = 0, \quad \forall \mu \in [1, P]; \tag{58} $$

this state trivially fulfills Eq. (35).

By replacing this expression in Eqs. (56) and (57) we find

$$ A^{\mu\nu} = \delta^{\mu\nu} [1 - \beta(1 - d)]. \tag{59} $$

Therefore, in this case, $A$ is diagonal and its eigenvalues are directly
$E_\mu = A^{\mu\mu} = 1 - \beta(1 - d)$, $\forall \mu \in [1, P]$. We can conclude that the
paramagnetic state exists and is stable in the region $1 - \beta(1 - d) > 0$, that is
(remembering that $T = \beta^{-1}$)

$$ \text{PM stability} \iff T > 1 - d. \tag{60} $$

This region is highlighted in Fig. 2.

Figure 2: (Color online) In the parameter space $(T, d)$ we highlighted the region
where the paramagnetic state exists and is stable. As proved in Sec. 4.1, this
region includes points fulfilling $T > 1 - d$; notice that this result is independent
of $P$.



4.2 Pure state 

Let us now consider the pure state, that is any of the P configurations 

^ = m(l,lt), (61) 



m being the extent of the overlap, which, in general, depends on d and on T. 
The related self-consistence equations are 



m^ = (l-d)tanh(/3m^). 



(62) 
(63) 



The first equation has a non-trivial solution in the whole region $T < 1-d$, and this
ensures that, in the same region, the pure state exists. In order to check its
stability, we calculate the stability matrix, finding

$$\begin{aligned}
A^{\mu\mu} &= 1-\beta(1-d)\,[1-\tanh^2(\beta m^{\mu})], \\
A^{\nu\nu} &= 1-\beta(1-d)\,[1-(1-d)\tanh^2(\beta m^{\mu})].
\end{aligned} \tag{64, 65, 66}$$



Therefore $A$ is diagonal and the eigenvalues are $E_\mu = A^{\mu\mu}$ and $E_\nu = A^{\nu\nu}$.
Notice that these eigenvalues do not depend on $P$ and that $E_\mu > E_\nu$, so that
the analysis can be restricted to $E_\nu$. Requiring the positivity of $E_\nu$, we get
the region in the plane $(T, d)$ where the pure state is stable; such a region is
shown in Fig. 3. We stress that this result is universal with respect to $P$ (in the
low-storage regime).

Figure 3: (Color online) In the parameter space $(T, d)$ we highlighted the region
where the pure state exists and is stable. This result was found by numerically
solving the self-consistence equation, Eq. (35), and the inequality $E_\nu > 0$, where
$E_\nu$ is the smallest eigenvalue of the stability matrix $A$ (see Eq. (66)); notice that
this result is independent of $P$.
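The numerical procedure behind the pure-state region can be sketched as follows. This is only an illustrative sketch, not the authors' code: it assumes a plain fixed-point iteration for the pure-state self-consistence equation $m = (1-d)\tanh(\beta m)$ and then evaluates the two diagonal entries of the stability matrix given above; the function name `pure_state` and the sample points are hypothetical.

```python
import math

def pure_state(T, d, iters=10000):
    """Solve m = (1 - d) tanh(m / T) by fixed-point iteration and return
    (m, E_mu, E_nu), the overlap and the two pure-state eigenvalues."""
    beta = 1.0 / T
    m = 1.0 - d  # start from the zero-noise value of the overlap
    for _ in range(iters):
        m = (1 - d) * math.tanh(beta * m)
    t2 = math.tanh(beta * m) ** 2
    E_mu = 1 - beta * (1 - d) * (1 - t2)
    E_nu = 1 - beta * (1 - d) * (1 - (1 - d) * t2)
    return m, E_mu, E_nu

# Scan a few sample points of the (T, d) plane.
for T, d in [(0.1, 0.1), (0.1, 0.5), (1.0, 0.5)]:
    m, E_mu, E_nu = pure_state(T, d)
    stable = m > 1e-6 and E_nu > 0
    print(f"T={T}, d={d}: m={m:.4f}, E_nu={E_nu:.4f}, pure state stable: {stable}")
```

Since $E_\nu$ is the smaller eigenvalue, checking its sign (together with $m > 0$) is enough to decide whether a given point belongs to the region; repeating the check on a grid of $(T, d)$ values would trace out a region of the kind shown in the figure.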



4.3 Symmetric state 

A symmetric mixture of states corresponds to configurations leading to

$$\vec m = m(d,T)\,(1, 1, 1, \dots, 1, 0, \dots, 0), \tag{67}$$

where $p \le P$ order parameters are equivalent and non-null, while the remaining
$P - p$ are vanishing.

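Let us start with the case $p = P = 3$, yielding $\vec m = m(d, T)(1, 1, 1)$. In this
special case the three self-consistence equations collapse on

$$m(d,T) = 2\left(\frac{1-d}{2}\right)^{3}\left[\tanh(3\beta m) + \tanh(\beta m)\right] + d^{2}(1-d)\tanh(\beta m) + 4d\left(\frac{1-d}{2}\right)^{2}\tanh(2\beta m), \tag{68}$$

and the matrix $A$ reads as

$$A = \begin{pmatrix} a & b & b \\ b & a & b \\ b & b & a \end{pmatrix}, \tag{69}$$

$a$ and $b$ being parameters related to $m$, $d$ and $\beta$. More precisely, the eigenvalues
of $A$ are $\{a + 2b, a - b, a - b\}$, which can be written as

$$\begin{aligned}
a - b &= 1-\beta(1-d) + 2\beta\left\{\tanh^2(2\beta m)\, d\left(\frac{1-d}{2}\right)^{2} + \tanh^2(\beta m)\left[4\left(\frac{1-d}{2}\right)^{3} + \frac{d^{2}(1-d)}{2}\right]\right\}, \\
a + 2b &= 1-\beta(1-d) + 2\beta\tanh^2(3\beta m)\, 3\left(\frac{1-d}{2}\right)^{3} + 8d\beta\tanh^2(2\beta m)\left(\frac{1-d}{2}\right)^{2} + \beta\tanh^2(\beta m)\left[2\left(\frac{1-d}{2}\right)^{3} + d^{2}(1-d)\right].
\end{aligned} \tag{70}$$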



The conditions for the existence and the stability of the symmetric, odd mixture
with $p = P = 3$ yield a system of equations which was solved numerically, and
the region where such conditions are all fulfilled is shown in Fig. 4. Notice that the
region is actually made up of two disconnected parts, each displaying peculiar
features, as explained later.
This result is robust with respect to $P$, provided that $P$ is odd and $p = P$.
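A possible way to carry out this numerical check for $p = P = 3$ is sketched below. The closed-form coefficients are those obtained by evaluating Eqs. (56), (57) and (68) on the symmetric state under the assumption that pattern entries are 0 with probability $d$ and $\pm 1$ with probability $(1-d)/2$ each; they should be read as our working reconstruction, and the sketch cross-checks them against a brute-force enumeration. All function names are hypothetical.

```python
import itertools
import numpy as np

def solve_m(beta, d, iters=5000):
    """Fixed-point iteration of the p = P = 3 self-consistence equation."""
    h = (1 - d) / 2
    m = 1 - d
    for _ in range(iters):
        m = (2 * h**3 * (np.tanh(3 * beta * m) + np.tanh(beta * m))
             + d**2 * (1 - d) * np.tanh(beta * m)
             + 4 * d * h**2 * np.tanh(2 * beta * m))
    return m

def closed_form_eigs(m, beta, d):
    """Eigenvalues a - b (twofold) and a + 2b in closed form."""
    h = (1 - d) / 2
    t1, t2, t3 = (np.tanh(k * beta * m) ** 2 for k in (1, 2, 3))
    base = 1 - beta * (1 - d)
    a_minus_b = base + 2 * beta * (d * h**2 * t2 + (4 * h**3 + d**2 * (1 - d) / 2) * t1)
    a_plus_2b = (base + 6 * beta * h**3 * t3 + 8 * d * beta * h**2 * t2
                 + beta * (2 * h**3 + d**2 * (1 - d)) * t1)
    return a_minus_b, a_plus_2b

def enumerated_eigs(m, beta, d, P=3):
    """Same eigenvalues from the exact average over the 3^P pattern entries."""
    probs = {-1.0: (1 - d) / 2, 0.0: d, 1.0: (1 - d) / 2}
    Q = np.zeros((P, P))
    for xi in itertools.product(probs, repeat=P):
        w = np.prod([probs[x] for x in xi])
        v = np.array(xi)
        Q += w * np.outer(v, v) * np.tanh(beta * m * v.sum()) ** 2
    A = (1 - beta * (1 - d)) * np.eye(P) + beta * Q
    return np.linalg.eigvalsh(A)

beta, d = 5.0, 0.1
m = solve_m(beta, d)
print("m =", m)
print("closed form:", closed_form_eigs(m, beta, d))
print("enumeration:", enumerated_eigs(m, beta, d))
```

A point $(T, d)$ belongs to the region of existence and stability when the iteration returns $m > 0$ and both eigenvalues are positive; scanning a grid of points in this way is one simple route to a plot of this kind.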





Figure 4: (Color online) In the parameter space $(T, d)$ we highlighted the region
where the symmetric state, for the special case $p = P = 3$, exists and is
stable. Notice that two disconnected regions emerge: the one corresponding
to lower values of dilution derives from the fact that $p$ is odd, while the one
corresponding to larger values of dilution derives from the fact that $p = P$.



We can further generalize the analysis by considering $P > p$, still being $p$
odd. In this case we get the following stability matrix

$$A = \begin{pmatrix} a & b & b & 0 \\ b & a & b & 0 \\ b & b & a & 0 \\ 0 & 0 & 0 & c \end{pmatrix}, \tag{71}$$

Figure 5: In this plot we focused on the region of the parameter space $(T, d)$
where odd symmetric spurious states exist and are stable. In particular, we chose
$P = 7$ and we considered any possible odd mixture, i.e. $p = 3$, $p = 5$ and $p = 7$;
each value of $p$ is represented by a different curve. Notice that the smaller $p$,
the wider the region, analogously to the standard Hopfield model.



with eigenvalues $\{a - b, a - b, a + 2b, c\}$, where

$$c = 1-\beta(1-d)\left\{1 - 2\left(\frac{1-d}{2}\right)^{3}\left[\tanh^2(3\beta m) + 3\tanh^2(\beta m)\right] - 6d\left(\frac{1-d}{2}\right)^{2}\tanh^2(2\beta m) - 3d^{2}(1-d)\tanh^2(\beta m)\right\} \tag{72}$$

has degeneracy $P - p$.

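Such states ($p < P$, $p$ odd) are stable only at small $d$. This is due to the
fact that the eigenvalue $c$ occurs only when $p < P$ and it reads as ($\mu > p$):

$$\begin{aligned}
A^{\mu\mu} &= [1-\beta(1-d)] + \beta\,\langle(\xi^{\mu})^{2}\rangle_{\xi}\,\Big\langle\tanh^2\Big[\beta m\sum_{i=1}^{p}\xi^{i}\Big]\Big\rangle_{\xi} \\
&= [1-\beta(1-d)] + \beta(1-d)\,\Big\langle\tanh^2\Big[\beta m\sum_{i=1}^{p}\xi^{i}\Big]\Big\rangle_{\xi}.
\end{aligned} \tag{73}$$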



Thus, one can see that the positive contribution on the r.h.s. contains factors $(1-d)$ at least of second
order, in such a way that when $d$ is close to 1, i.e. for high dilution, and $T < 1-d$,
the r.h.s. becomes negative. On the other hand, in the case $\mu \le p$, we get

$$A^{\mu\mu} = [1-\beta(1-d)] + \beta(1-d)\,\Big\langle\tanh^2\Big[\beta m\Big(1 + \sum_{i \le p,\ i \ne \mu}\xi^{i}\Big)\Big]\Big\rangle_{\xi},$$

and therefore the r.h.s. contains even first-order terms in $(1-d)$, which are
comparable with $\beta(1-d)$.
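To make the role of the eigenvalue $c$ concrete, the sketch below evaluates it for a three-component symmetric state embedded in a network with $P > 3$. It assumes pattern entries equal to 0 with probability $d$ and $\pm 1$ with probability $(1-d)/2$ each, so that the average $\langle\tanh^2[\beta m(\xi^1+\xi^2+\xi^3)]\rangle_\xi$ can be written out explicitly; the overlap $m$ is taken from the $p = 3$ self-consistence equation by fixed-point iteration. Function names and sample points are hypothetical.

```python
import math

def symmetric_m(beta, d, iters=5000):
    """Overlap of the three-component symmetric state (fixed-point iteration)."""
    h = (1 - d) / 2
    m = 1 - d
    for _ in range(iters):
        m = (2 * h**3 * (math.tanh(3 * beta * m) + math.tanh(beta * m))
             + d**2 * (1 - d) * math.tanh(beta * m)
             + 4 * d * h**2 * math.tanh(2 * beta * m))
    return m

def eigenvalue_c(T, d):
    """c = 1 - beta(1-d) * (1 - <tanh^2[beta m (xi1 + xi2 + xi3)]>)."""
    beta = 1.0 / T
    m = symmetric_m(beta, d)
    h = (1 - d) / 2
    t1, t2, t3 = (math.tanh(k * beta * m) ** 2 for k in (1, 2, 3))
    # Three, two or one non-null entries among (xi1, xi2, xi3).
    avg = 2 * h**3 * (t3 + 3 * t1) + 6 * d * h**2 * t2 + 3 * d**2 * (1 - d) * t1
    return 1 - beta * (1 - d) * (1 - avg)

for d in (0.01, 0.8):
    print(f"T=0.05, d={d}: c = {eigenvalue_c(0.05, d):.3f}")
```

With these sample values the eigenvalue is positive at very small dilution and negative at high dilution, in line with the statement that such states are stable only at small $d$.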

Moreover, we find that the $p$-component, odd symmetric state exists and
is stable in a region of the space $(T, d)$ which gets smaller and smaller as $p$
grows (see Fig. 5). The emergence of such states can be seen as a feature of
robustness of the standard Hopfield model with respect to dilution.

Finally, the case $P = p$ always admits a region of existence and stability in
the regime of high dilution. The latter region is independent of the parity and
depends slightly on $P$ (see Fig. 5). The emergence of such states is due to the
failure of hierarchical retrieval, namely uniformity prevails.




Figure 6: In this plot we focused on the region of the parameter space $(T, d)$
where symmetric spurious states with $p = P$ exist and are stable. In particular,
we chose $P = 7$ and we considered any possible mixture, i.e. $p = 3$, $p = 4$,
$p = 5$, $p = 6$ and $p = 7$; each value of $p$ is represented by a different curve.
Notice that the smaller $p$, the wider the region, yet the region tends to an
"asymptotic shape".



4.4 Parallel state 

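The parallel-retrieval state can be looked at as the extension to arbitrary values
of $d$ of the pure state holding for the special case $T = 0$. We recall that in the
noiseless limit the parallel-retrieval state can be described as

$$\vec m = \left((1-d),\ (1-d)d,\ (1-d)d^{2},\ \dots,\ (1-d)d^{P-1}\right). \tag{74}$$

In this case the stability matrix is diagonal with terms:

$$A^{\mu\mu} = 1-\beta(1-d) + \beta\,\big\langle(\xi^{\mu})^{2}\tanh^2\big[\beta(1-d)(\xi^{1} + d\xi^{2} + \dots + d^{P-1}\xi^{P})\big]\big\rangle_{\xi}, \tag{75}$$

and, consistently, taking the limit $\beta \to \infty$, we get the simplified form

$$A^{\mu\mu} \xrightarrow[\beta\to\infty]{} 1-\beta(1-d) + \beta\,\big\langle(\xi^{\mu})^{2}\big\{1-\delta\big[(\xi^{1} + d\xi^{2} + \dots + d^{P-1}\xi^{P})\big]\big\}\big\rangle_{\xi}. \tag{76}$$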

Now, the third term in the r.h.s. is either $\beta\langle(\xi^{\mu})^{2}\rangle_{\xi} = \beta(1-d)$ (when the
polynomial in Eq. (76) is non-zero) or 0; the latter case would trivially yield $A^{\mu\mu} < 0$.
Therefore, in the limit $\beta \to \infty$ the stability of the parallel-retrieval state is
constrained by the smallest real root in $[0, 1]$ of the polynomial $\xi^{1} + d\xi^{2} + \dots + d^{P-1}\xi^{P}$
with $\xi^{i} = 1, 0, -1$. This corresponds to $\xi^{1} = 1$ and $\xi^{i} = -1$, $\forall i > 1$, under gauge
symmetry and returns the same result found, from a more empirical point of
view, in Sec. 3.1. More precisely, the critical dilution converges exponentially
to 1/2 as $P$ grows.

In particular, for $P = 3$ we find that the parallel-retrieval state exists and
is stable in the interval $d \in \left(0, \frac{\sqrt{5}-1}{2}\right) \approx (0, 0.618)$. The point $d_c(3) = \frac{\sqrt{5}-1}{2}$
corresponds to the unique real root in $(0, 1)$.
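The critical dilution at zero noise can be checked with a few lines of code. The sketch below assumes, as stated above, that the relevant configuration is $\xi^1 = 1$, $\xi^{i>1} = -1$, so that the polynomial to solve is $1 - d - d^2 - \dots - d^{P-1}$; the root in $[0, 1]$ is located by bisection. The function name `critical_dilution` is hypothetical.

```python
def critical_dilution(P, tol=1e-12):
    """Root in [0, 1] of f(d) = 1 - d - d^2 - ... - d^(P-1), found by bisection.

    f is decreasing on [0, 1], with f(0) = 1 and f(1) = 2 - P <= 0 for P >= 2.
    """
    def f(d):
        return 1.0 - sum(d**k for k in range(1, P))

    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for P in (3, 4, 5, 10, 20):
    print(P, critical_dilution(P))
```

For $P = 3$ the routine returns the golden-ratio value $(\sqrt{5}-1)/2$, and the values for larger $P$ decrease quickly towards 1/2, since the polynomial equation is equivalent to $1 - 2d + d^{P} = 0$ for $d \neq 1$.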

When noise is introduced, the critical dilution $d_c$, separating the parallel-
retrieval state from spurious states, is shifted towards larger values, as suggested
by Eq. (75). On the opposite side, namely in the regime of small dilution, the
parallel state is progressively depleted and, as the temperature is increased,
magnetizations vanish, starting from $m_P$ and proceeding up to $m_2$. One can
distinguish a set of temperatures $T_P(d) < T_{P-1}(d) < \dots < T_2(d) < T_1(d)$, such
that when $T > T_k(d)$, all magnetizations $m_i$, $\forall i \ge k$, are null on average. Hence,
above $T_2(d)$ the pure state retrieval is recovered, while above $T_1(d) = 1-d$ the
paramagnetic state emerges.
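The progressive depletion of the parallel state with temperature can be explored numerically. The sketch below is a hedged illustration rather than the authors' procedure: it assumes that the self-consistence equations have the form $m^\mu = \langle \xi^\mu \tanh(\beta\,\vec\xi\cdot\vec m)\rangle_\xi$, with entries 0 with probability $d$ and $\pm 1$ with probability $(1-d)/2$ each, and iterates them starting from the zero-noise parallel state. The function name `solve_magnetizations` is hypothetical.

```python
import itertools
import numpy as np

def solve_magnetizations(T, d, P=3, iters=2000):
    """Iterate m = <xi tanh(beta xi.m)>_xi from the zero-noise parallel state."""
    beta = 1.0 / T
    # All 3^P pattern configurations and their probabilities.
    xi = np.array(list(itertools.product((-1.0, 0.0, 1.0), repeat=P)))
    w = np.where(xi == 0.0, d, (1 - d) / 2).prod(axis=1)
    m = (1 - d) * d ** np.arange(P)  # ((1-d), (1-d)d, ..., (1-d)d^(P-1))
    for _ in range(iters):
        m = (w[:, None] * xi * np.tanh(beta * (xi @ m))[:, None]).sum(axis=0)
    return m

d = 0.3
for T in (0.01, 0.1, 0.4, 2.0):
    print(T, np.round(solve_magnetizations(T, d), 4))
```

At very low noise the iteration stays at the hierarchical values $(1-d)d^{k}$, while well above $T = 1-d$ all magnetizations decay to zero; intermediate temperatures can be inspected to see the smaller magnetizations vanish first.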




Figure 7: In this plot we focused on the region of the parameter space $(T, d)$
where parallel retrieval states exist and are stable. In particular, we chose $P = 5$
and we considered any possible state with $k = 2$, $k = 3$, $k = 4$ and $k = 5$ non-null
magnetizations.



In Fig. 7 we highlight the region of the parameter space $(T, d)$ where such
parallel states exist and are stable. This was obtained numerically for the case
$P = 5$; for larger values of $P$ the region is slightly restricted to account for the
shift in $d_c$.

Finally, the results collected so far are used to depict the phase diagrams for
$P = 2$, $P = 3$ and $P = 5$ (see Fig. 8, from left to right).



Figure 8: Phase diagram obtained from the analysis described in Sec. 4. Each
panel refers to a different value of $P$, namely $P = 2$ (leftmost panel), $P = 3$
(middle panel) and $P = 5$ (rightmost panel). These theoretical predictions were
also successfully compared with results from numerical simulations. Notice that
when $P > 3$, the region between the parallel states and the symmetric states
includes spurious states, which are, in general, a combination of the hybrid state
and of the parallel state.



5 Discussion 

In this work we explored the retrieval capabilities of the multitasking associa- 
tive network introduced in [Mj. Such a system is characterized by (quenched) 
patterns which display a fraction d of null entries: interestingly, by paying the 
price of reducing the amount of information stored within each pattern (by a 
fraction d), we get a system able to retrieve several patterns at the same time. 
Thus, this constitutes a model of a low information parallel processor; such a 
system can indeed be a good toy-model for all the phenomena where coordi- 
nated multitasking features are expected as for instance in adaptive immune 
networks or peripheral nervous systems [T7| [H] . 

At zero noise level ($T = 0$), and for relatively low degrees of dilution, the
system converges to an equilibrium state characterized by overlap
$\vec m = ((1-d), (1-d)d, \dots, (1-d)d^{P-2}, (1-d)d^{P-1})$, where $P$ is the number of stored patterns.
Although this state displays non-null overlap with several patterns, it does not
represent a spurious state, as can be seen by noticing, for instance, that this
state allows the complete retrieval of at least one pattern. However, through a
careful inspection, we proved in this paper that there are regions in the $(T, d)$
plane where genuine spurious states occur; hence a clear picture of the phase
diagram becomes a fundamental issue in order to make the model ready for
practical implementations.

A remarkable difference with respect to standard (serial processing) neural
networks lies in the stability of mixture states: both even and odd mixtures are
stable, which, within the world of spurious states, was a somewhat desired, and
expected, result, as there is neither a biological reason nor a prescription from
robotics to weight odd and even mixtures differently (their difference lies in
the gauge invariance of the standard Hopfield model, which is broken within
our framework due to the partial blankness of the pattern entries). Another
expected feature, which we confirmed in this paper, is the emergence of parallel
spurious states beyond the standard ones from classical neural network theory: this
is the natural generalization of the latter when moving from serial to parallel
processing.

Beyond these somewhat expected results, the phase diagram of the model is
still very rich and composed of several non-overlapping regions where the
retrieval states are structured in deeply different ways: beyond the paramagnetic state
and the pure state, the system is able to achieve both a hierarchical organization
of pattern retrievals (for intermediate values of dilution) and a completely
symmetric parallel state (for high values of dilution), which act as the basis for
the outlined mixtures when raising the noise level above thresholds whose value
depends on the load of the network $P$.

These findings have been obtained by developing a new strategy for computing the
free energy of the model by which, imposing thermodynamic principles (hence
extremizing the latter over the order parameters of the model), self-consistency
has been obtained. The whole procedure has been strongly based on techniques
stemming from partial differential equation theory. In particular, the key idea
is showing that the noise-derivatives of the statistical pressure obey Burgers'
equations, which can be solved through the Cole-Hopf transformation. The latter
maps the evolution of the free energy over the noise into a diffusion problem
which can be addressed through standard Green integration in momentum space
and then pushed back in the original framework.

In the future, effort must still be spent in order to achieve a clear scenario in
the hyper-diluted regime, namely where the dilution scales as a function of the
volume (the number of neurons), which cannot be accomplished through the
techniques we presented here as saddle point integration is no longer useful. We
plan to report on this research soon.

This work is supported by FIRB grant RBFR08EKEV. 